skip to main content


Search for: All records

Creators/Authors contains: "Singh, Mohit"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. Given a set of facilities and clients, and costs to open facilities, the classic facility location problem seeks to open a set of facilities and assign each client to one open facility to minimize the cost of opening the chosen facilities and the total distance of the clients to their assigned open facilities. Such an objective may induce an unequal cost over certain socioeconomic groups of clients (i.e., total distance traveled by clients in such a group). This is important when planning the location of socially relevant facilities such as emergency rooms. In this work, we consider a fair version of the problem where we are given π‘Ÿ clients groups that partition the set of clients, and the distance of a given group is defined as the average distance of clients in the group to their respective open facilities. The objective is to minimize the Minkowski 𝑝-norm of vector of group distances, to penalize high access costs to open facilities across π‘Ÿ groups of clients. This generalizes classic facility location (𝑝 = 1) and the minimization of the maximum group distance (𝑝 = ∞). However, in practice, fairness criteria may not be explicit or even known to a decision maker, and it is often unclear how to select a specific "𝑝" to model the cost of unfairness. To get around this, we study the notion of solution portfolios where for a fixed problem instance, we seek a small portfolio of solutions such that for any Minkowski norm 𝑝, one of these solutions is an 𝑂(1)-approximation. Using the geometric relationship between various 𝑝-norms, we show the existence of a portfolio of cardinality 𝑂(log π‘Ÿ), and a lower bound of (\sqrt{log r}). There may not be common structure across different solutions in this portfolio, which can make planning difficult if the notion of fairness changes over time or if the budget to open facilities is disbursed over time. For example, small changes in 𝑝 could lead to a completely different set of open facilities in the portfolio. Inspired by this, we introduce the notion of refinement, which is a family of solutions for each 𝑝-norm satisfying a combinatorial property. This property requires that (1) the set of facilities open for a higher 𝑝-norm must be a subset of the facilities open for a lower 𝑝-norm, and (2) all clients assigned to an open facility for a lower 𝑝-norm must be assigned to the same open facility for any higher 𝑝-norm. A refinement is 𝛼-approximate if the solution for each 𝑝-norm problem is an 𝛼-approximation for it. We show that it is sufficient to consider only 𝑂(log π‘Ÿ) norms instead of all 𝑝-norms, 𝑝 ∈ [1, ∞] to construct refinements. A natural greedy algorithm for the problem gives a poly(π‘Ÿ)-approximate refinement, which we improve to poly(r^1/\sqrt{log π‘Ÿ})-approximate using a recursive algorithm. We improve this ratio to 𝑂(log π‘Ÿ) for the special case of tree metric for uniform facility open cost. Our recursive algorithm extends to other settings, including to a hierarchical facility location problem that models facility location problems at several levels, such as public works departments and schools. A full version of this paper can be found at https://arxiv.org/abs/2211.14873. 
    more » « less
    Free, publicly-accessible full text available July 7, 2024
  2. A graph profile records all possible densities of a fixed finite set of graphs. Profiles can be extremely complicated; for instance the full profile of any triple of connected graphs is not known, and little is known about hypergraph profiles. We introduce the tropicalization of graph and hypergraph profiles. Tropicalization is a well-studied operation in algebraic geometry, which replaces a variety (the set of real or complex solutions to a finite set of algebraic equations) with its β€œcombinatorial shadow”. We prove that the tropicalization of a graph profile is a closed convex cone, which still captures interesting combinatorial information. We explicitly compute these tropicalizations for arbitrary sets of complete and star hypergraphs. We show they are rational polyhedral cones even though the corresponding profiles are not even known to be semialgebraic in some of these cases. We then use tropicalization to prove strong restrictions on the power of the sums of squares method, equivalently Cauchy-Schwarz calculus, to test (which is weaker than certification) the validity of graph density inequalities. In particular, we show that sums of squares cannot test simple binomial graph density inequalities, or even their approximations. Small concrete examples of such inequalities are presented, and include the famous Blakley-Roy inequalities for paths of odd length. As a consequence, these simple inequalities cannot be written as a rational sum of squares of graph densities. 
    more » « less
  3. We study optimal design problems in which the goal is to choose a set of linear measurements to obtain the most accurate estimate of an unknown vector. We study the [Formula: see text]-optimal design variant where the objective is to minimize the average variance of the error in the maximum likelihood estimate of the vector being measured. We introduce the proportional volume sampling algorithm to obtain nearly optimal bounds in the asymptotic regime when the number [Formula: see text] of measurements made is significantly larger than the dimension [Formula: see text] and obtain the first approximation algorithms whose approximation factor does not degrade with the number of possible measurements when [Formula: see text] is small. The algorithm also gives approximation guarantees for other optimal design objectives such as [Formula: see text]-optimality and the generalized ratio objective, matching or improving the previously best-known results. We further show that bounds similar to ours cannot be obtained for [Formula: see text]-optimal design and that [Formula: see text]-optimal design is NP-hard to approximate within a fixed constant when [Formula: see text]. 
    more » « less
  4. Scrubbing sensitive data before releasing memory is a widely accepted but often ignored programming practice for developing secure software. Consequently, confidential data such as cryptographic keys, passwords, and personal data, can remain in memory indefinitely, thereby increasing the risk of exposure to hackers who can retrieve the data using memory dumps or exploit vulnerabilities such as Heartbleed and Etherleak. We propose an approach for detecting a specific memory safety bug called Improper Clearing of Heap Memory Before Release, also known as Common Weakness Enumeration 244, in C programs. The CWE-244 bug in a program allows the leakage of confidential information when a variable is not wiped before heap memory is freed. Our approach combines taint analysis and model checking to detect this weakness. We have three main phases: (1) perform a coarse flow-insensitive inter-procedural static analysis on the program to construct a set of pointer variables that could point to sensitive data; (2) instrument the program with required dynamic variable tracking, and assertion logic for memory wiping before deallocation; and (3) invoke a model checker, the C-Bounded Model Checker (CBMC) in our case, to detect assertion violation in the instrumented program. We develop a tool, \toolname, implementing our instrumentation based algorithm, and we provide experimental validation on the Juliet Test Suite --- the tool is able to detect all the CWE-244 instances present in the test suite. To the best of our knowledge, this is the first work which presents a solution to the problem of detecting unscrubbed secure memory deallocation violations in programs. 
    more » « less
  5. Constrained submodular function maximization has been used in subset selection problems such as selection of most informative sensor locations. Although these models have been quite popular, the solutions obtained via this approach are unstable to perturbations in data defining the submodular functions. Robust submodular maximization has been proposed as a richer model that aims to overcome this discrepancy as well as increase the modeling scope of submodular optimization. In this work, we consider robust submodular maximization with structured combinatorial constraints and give efficient algorithms with provable guarantees. Our approach is applicable to constraints defined by single or multiple matroids and knapsack as well as distributionally robust criteria. We consider both the offline setting where the data defining the problem are known in advance and the online setting where the input data are revealed over time. For the offline setting, we give a general (nearly) optimal bicriteria approximation algorithm that relies on new extensions of classical algorithms for submodular maximization. For the online version of the problem, we give an algorithm that returns a bicriteria solution with sublinear regret. Summary of Contribution: Constrained submodular maximization is one of the core areas in combinatorial optimization with a wide variety of applications in operations research and computer science. Over the last decades, both communities have been interested on the design and analysis of new algorithms with provable guarantees. Sensor location, influence maximization and data summarization are some of the applications of submodular optimization that lie at the intersection of the aforementioned communities. Particularly, our work focuses on optimizing several submodular functions simultaneously. We provide new insights and algorithms to the offline and online variants of the problem which significantly expand the related literature. At the same time, we provide a computational study that supports our theoretical results. 
    more » « less
  6. null (Ed.)
    Experimental design is a classical statistics problem, and its aim is to estimate an unknown vector from linear measurements where a Gaussian noise is introduced in each measurement. For the combinatorial experimental design problem, the goal is to pick a subset of experiments so as to make the most accurate estimate of the unknown parameters. In this paper, we will study one of the most robust measures of error estimationβ€”the D-optimality criterion, which corresponds to minimizing the volume of the confidence ellipsoid for the estimation error. The problem gives rise to two natural variants depending on whether repetitions of experiments are allowed or not. We first propose an approximation algorithm with a 1/e-approximation for the D-optimal design problem with and without repetitions, giving the first constant-factor approximation for the problem. We then analyze another sampling approximation algorithm and prove that it is asymptotically optimal. Finally, for D-optimal design with repetitions, we study a different algorithm proposed by the literature and show that it can improve this asymptotic approximation ratio. All the sampling algorithms studied in this paper are shown to admit polynomial-time deterministic implementations. 
    more » « less
  7. null (Ed.)